29 research outputs found
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
Building agents using large language models (LLMs) to control computers is an
emerging research field, where the agent perceives computer states and performs
actions to accomplish complex tasks. Previous computer agents have demonstrated
the benefits of in-context learning (ICL); however, their performance is
hindered by several issues. First, the limited context length of LLMs and
complex computer states restrict the number of exemplars, as a single webpage
can consume the entire context. Second, the exemplars in current methods, such
as high-level plans and multi-choice questions, cannot represent complete
trajectories, leading to suboptimal performance in tasks that require many
steps or repeated actions. Third, existing computer agents rely on
task-specific exemplars and overlook the similarity among tasks, resulting in
poor generalization to novel tasks. To address these challenges, we introduce
Synapse, featuring three key components: i) state abstraction, which filters
out task-irrelevant information from raw states, allowing more exemplars within
the limited context, ii) trajectory-as-exemplar prompting, which prompts the
LLM with complete trajectories of the abstracted states and actions for
improved multi-step decision-making, and iii) exemplar memory, which stores the
embeddings of exemplars and retrieves them via similarity search for
generalization to novel tasks. We evaluate Synapse on MiniWoB++, a standard
task suite, and Mind2Web, a real-world website benchmark. In MiniWoB++, Synapse
achieves a 99.2% average success rate (a 10% relative improvement) across 64
tasks using demonstrations from only 48 tasks. Notably, Synapse is the first
ICL method to solve the book-flight task in MiniWoB++. Synapse also exhibits a
53% relative improvement in average step success rate over the previous
state-of-the-art prompting scheme in Mind2Web.
Comment: 22 pages, 7 figures
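The exemplar memory described in the abstract above (store embeddings of exemplars, retrieve by similarity search) can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function is a toy hashing embedding standing in for a learned embedding model, and the names `ExemplarMemory`, `add`, and `retrieve` are hypothetical.

```python
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy bag-of-words hashing embedding (assumption: a real system
    would use a learned text-embedding model instead)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        # Deterministic bucket index from the token's character codes.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


class ExemplarMemory:
    """Stores (task embedding, trajectory) pairs; retrieves by cosine similarity."""

    def __init__(self) -> None:
        self.keys: list[np.ndarray] = []
        self.trajectories: list[list[str]] = []

    def add(self, task_description: str, trajectory: list[str]) -> None:
        self.keys.append(embed(task_description))
        self.trajectories.append(trajectory)

    def retrieve(self, query: str, k: int = 1) -> list[list[str]]:
        # Embeddings are unit-normalized, so the dot product is the cosine similarity.
        sims = np.array(self.keys) @ embed(query)
        top = np.argsort(-sims)[:k]
        return [self.trajectories[i] for i in top]


memory = ExemplarMemory()
memory.add("book a flight from city to city",
           ["type origin", "type destination", "click search"])
memory.add("send an email to a contact",
           ["click compose", "type address", "click send"])

# A novel task retrieves the trajectory of the most similar stored task.
nearest = memory.retrieve("book a flight to paris", k=1)
print(nearest[0])
```

The retrieved trajectories would then be placed in the prompt as exemplars; how the states inside each trajectory are abstracted is outside the scope of this sketch.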
Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context
Financial simulators play an important role in enhancing forecasting
accuracy, managing risks, and fostering strategic financial decision-making.
Despite the development of financial market simulation methodologies, existing
frameworks often struggle with adapting to specialized simulation context. We
pinpoint the challenges as i) current financial datasets do not contain context
labels; ii) current techniques are not designed to generate financial data with
context as control, which demands greater precision compared to other
modalities; iii) the inherent difficulties in generating context-aligned,
high-fidelity data given the non-stationary, noisy nature of financial data. To
address these challenges, our contributions are: i) we propose the Contextual
Market Dataset with market dynamics, stock ticker, and history state as
context, leveraging a market dynamics modeling method that combines linear
regression and Dynamic Time Warping clustering to extract market dynamics; ii)
we present Market-GAN, a novel architecture incorporating a Generative
Adversarial Network (GAN) for the controllable generation with context, an
autoencoder for learning low-dimension features, and supervisors for knowledge
transfer; iii) we introduce a two-stage training scheme to ensure that
Market-GAN captures the intrinsic market distribution with multiple objectives.
In the pretraining stage, with the use of the autoencoder and supervisors, we
prepare the generator with a better initialization for the adversarial training
stage. We propose a set of holistic evaluation metrics that consider alignment,
fidelity, data usability on downstream tasks, and market facts. We evaluate
Market-GAN with the Dow Jones Industrial Average data from 2000 to 2023 and
showcase superior performance in comparison to 4 state-of-the-art time-series
generative models.
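The abstract above names two ingredients for extracting market dynamics: linear regression and Dynamic Time Warping (DTW) clustering. The sketch below shows only the two textbook building blocks, a least-squares trend slope and the classic DTW distance. How the paper combines them is not stated in the abstract, so the function names `trend_slope` and `dtw_distance` and the example series are assumptions for illustration.

```python
import numpy as np


def trend_slope(prices) -> float:
    """Slope of a least-squares line fitted to a price segment."""
    y = np.asarray(prices, dtype=float)
    x = np.arange(len(y))
    return float(np.polyfit(x, y, 1)[0])


def dtw_distance(a, b) -> float:
    """Classic O(n*m) Dynamic Time Warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


# Two segments with the same shape but different pacing align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
slope = trend_slope([0, 1, 2, 3])
print(same_shape, shifted, slope)
```

A pairwise DTW distance matrix over price segments could then be passed to any distance-based clustering routine to group segments with similar dynamics.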
Offline Equilibrium Finding
Offline reinforcement learning (Offline RL) is an emerging field that has
recently begun gaining attention across various application domains due to its
ability to learn behavior from earlier collected datasets. Using logged data is
imperative when further interaction with the environment is expensive
(computationally or otherwise), unsafe, or entirely unfeasible. Offline RL
proved very successful, paving a path to solving previously intractable
real-world problems, and we aim to generalize this paradigm to a multi-agent or
multiplayer-game setting. Very little research has been done in this area, as
the progress is hindered by the lack of standardized datasets and meaningful
benchmarks. In this work, we coin the term offline equilibrium finding (OEF) to
describe this area and construct multiple datasets consisting of strategies
collected across a wide range of games using several established methods. We
also propose a benchmark method -- an amalgamation of a behavior-cloning and a
model-based algorithm. Our two model-based algorithms -- OEF-PSRO and OEF-CFR
-- are adaptations of the widely-used equilibrium finding algorithms Deep CFR
and PSRO in the context of offline learning. In the empirical part, we evaluate
the performance of the benchmark algorithms on the constructed datasets. We
hope that our efforts may help to accelerate research in large-scale
equilibrium finding. Datasets and code are available at
https://github.com/SecurityGames/oef
EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading
High-frequency trading (HFT) uses computer algorithms to make trading
decisions on short time scales (e.g., second-level) and is widely used in
the Cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL)
in financial research has shown stellar performance on many quantitative
trading tasks. However, most methods focus on low-frequency trading, e.g.,
day-level, which cannot be directly applied to HFT because of two challenges.
First, RL for HFT involves dealing with extremely long trajectories (e.g., 2.4
million steps per month), which are hard to optimize and evaluate. Second, the
dramatic price fluctuations and market trend changes of Crypto make existing
algorithms fail to maintain satisfactory performance. To tackle these
challenges, we propose an Efficient hieArchical Reinforcement learNing method
for High Frequency Trading (EarnHFT), a novel three-stage hierarchical RL
framework for HFT. In stage I, we compute a Q-teacher, i.e., the optimal action
value based on dynamic programming, for enhancing the performance and training
efficiency of second-level RL agents. In stage II, we construct a pool of
diverse RL agents for different market trends, distinguished by return rates,
where hundreds of RL agents are trained with different preferences of return
rates and only a tiny fraction of them will be selected into the pool based on
their profitability. In stage III, we train a minute-level router which
dynamically picks a second-level agent from the pool to achieve stable
performance across different markets. Through extensive experiments in various
market trends on Crypto markets in a high-fidelity simulation trading
environment, we demonstrate that EarnHFT significantly outperforms 6
state-of-the-art baselines in 6 popular financial criteria, exceeding the runner-up
by 30% in profitability.
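Stage I in the abstract above computes a Q-teacher, i.e., the optimal action value obtained by dynamic programming. A minimal sketch of that idea follows, under strong simplifying assumptions that are not taken from the paper: the price series is fully known in hindsight, the position is either flat (0) or long one unit (1), the action is the target position, and the fee is proportional to the traded notional. The function name `q_teacher` is hypothetical.

```python
import numpy as np


def q_teacher(prices, fee: float = 0.0) -> np.ndarray:
    """Backward-induction optimal action values on a known price path.

    Returns q with shape (T + 1, 2, 2), where q[t, pos, act] is the best
    achievable future profit when holding `pos` at step t and moving to
    target position `act`. The last slice q[T] is the zero terminal value.
    """
    prices = np.asarray(prices, dtype=float)
    T = len(prices) - 1
    q = np.zeros((T + 1, 2, 2))
    for t in range(T - 1, -1, -1):
        # Best continuation value for each position held at step t + 1.
        next_value = q[t + 1].max(axis=1)
        for pos in (0, 1):
            for act in (0, 1):
                trade_cost = fee * prices[t] * abs(act - pos)
                pnl = act * (prices[t + 1] - prices[t])
                q[t, pos, act] = pnl - trade_cost + next_value[act]
    return q


q = q_teacher([1.0, 2.0, 1.0, 3.0], fee=0.0)
best_first_action = int(np.argmax(q[0, 0]))  # best target position when starting flat
best_total_profit = float(q[0, 0].max())
print(best_first_action, best_total_profit)
```

Such hindsight-optimal values can only be computed on historical data, which is why they would serve as a training signal for an agent rather than as a trading policy.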
Annealing tunable charge density wave order in a magnetic kagome material FeGe
In the magnetic kagome metal FeGe, a charge density wave (CDW) order emerges
inside the antiferromagnetic phase, providing a fertile playground to
investigate the interplay between charge and magnetic orders. Here, we
demonstrate that the CDW order, as well as magnetic properties, can be
reversibly tuned on a large scale through post-growth annealing treatments. The
antiferromagnetic and CDW transitions vary systematically as functions of both
the temperature and the time period of annealing. Long-range CDW order with a
maximum and a minimum can be realized in
crystals annealed at 320 °C for over 48 h. Using
magnetization and magnetostrictive coefficient measurements, it is found that
the CDW transition is rather stable against an external magnetic field and
spin-flop transition. On the other hand, the critical field for spin-flop
transition is significantly reduced in the long-range ordered CDW phase. Our
results indicate that the CDW in FeGe is immune to variations in magnetic
orders, while the magnetocrystalline anisotropy energy and the corresponding
magnetic ground state can be altered significantly by the charge order. These
findings provide crucial clues for further investigation and a better
understanding of the nature of the CDW order in FeGe.
Comment: 8 pages, 4 figures
Angle dependent field-driven reorientation transitions in uniaxial antiferromagnet MnBi2Te4 single crystal
MnBi2Te4, a two-dimensional magnetic topological insulator with a
uniaxial antiferromagnetic structure, is an ideal platform to realize the quantum
anomalous Hall effect. However, the strength of its magnetic interactions is not
yet clear. We performed systematic studies on the magnetization and angle
dependent magnetotransport of a MnBi2Te4 single crystal. The results show
that the direction of the magnetic field has significant effects on the
critical field values and magnetic structure of this compound, which leads to
different magnetotransport behaviors. The field-driven reorientation
transitions can be utilized to estimate the AFM interlayer exchange interaction
coupling and uniaxial magnetic anisotropy D. Monte Carlo simulations based on
the obtained Hamiltonian explain the experimental data well. Our comprehensive
studies of the field-driven magnetic transitions in MnBi2Te4
provide a general approach for other topological systems with
antiferromagnetism.
Comment: 6 figures
Multiband effects in thermoelectric and electrical transport properties of kagome superconductors AV3Sb5 (A = K, Rb, Cs)
We studied the effects of multiband electronic structure on the
thermoelectric and electrical transport properties in the normal state of
kagome superconductors AV3Sb5 (A = K, Rb, Cs). In all three members,
the multiband nature is manifested by sign changes in the temperature
dependence of the Seebeck and Hall resistivity, together with sublinear
response of the isothermal Nernst and Hall effects to external magnetic fields
in the charge ordered state. Moreover, ambipolar transport effects appear
ubiquitously in all three systems, giving rise to a sizable Nernst signal.
Finally, possible origins of the sign reversal in the temperature dependence of
the Hall effect are discussed.
Comment: 8 pages, 5 figures. To appear in New Journal of Physics